Uncovering myocardial infarction genetic signatures using GWAS exploration in Saudi and European cohorts

Genome-wide association studies (GWAS) have yielded significant insights into the genetic architecture of myocardial infarction (MI), although studies in non-European populations are still lacking. Saudi Arabian cohorts offer an opportunity to discover novel genetic variants impacting disease risk due to a high rate of consanguinity. Genome-wide genotyping (GWG), imputation and GWAS followed by meta-analysis were performed based on two independent Saudi Arabian studies comprising 3950 MI patients and 2324 non-MI controls. Meta-analyses were then performed with these two Saudi MI studies and the CARDIoGRAMplusC4D and UK Biobank GWAS as controls. Meta-analyses of the two Saudi MI studies resulted in 17 SNPs with genome-wide significance. Meta-analyses of all 4 studies revealed 66 loci with genome-wide significance levels of p < 5 × 10⁻⁸. All of these variants, except rs2764203, have previously been reported as MI-associated loci or as being in high linkage disequilibrium with known loci. One SNP association in Shisa family member 5 (SHISA5) (rs11707229) was evident at a much higher frequency in the Saudi MI populations (> 12% MAF). In conclusion, our results replicated many MI associations, whereas in Saudi-only GWAS (meta-analyses), several new loci were implicated that require future validation and functional analyses.


Patient sampling and phenotyping
Saudi MI Study 1
From 2019 to 2020, samples and data from consecutive subjects with MI visiting the Cardiology Clinics, King Fahd Hospital of the University, Al-Khobar, and King Fahd Hospital, Alhafof, Saudi Arabia, were collected for inclusion in this study. Participants ranged in age from 25 to 66 years and were clinically diagnosed with MI at the time of recruitment. Clinical diagnosis of MI was derived according to the fourth universal definition of MI 19. The phenotypic data of all subjects were reviewed by a cardiologist consultant to verify uniformity among sites and eligibility according to study criteria. Eligibility for each of the individual cases was reviewed by the consultant committee and assessed for inclusion. For secondary analyses, T2D and hypertension were defined using WHO criteria; LDL, HDL, total cholesterol and troponin I were determined using Direct LDL-, Ultra HDL-, Cholesterol- and STAT High Sensitive Troponin I-Alinity c Reagent kits (Abbott, Wiesbaden, Germany) 20,21.

Saudi MI Study 2
Details of the MI patients and controls in this Saudi study are described in a 2016 GWAS of CAD/MI by Wakil et al. 22. Patients with suspected CAD/MI based on coronary angiography and echocardiography (ECG) abnormalities at the Catheterization Centre of King Faisal Heart Institute, King Faisal Specialist Hospital and Research Centre, Riyadh (KFSHRC), Saudi Arabia, were evaluated and represented all five regions of the country. Changes in the biomarkers myoglobin, cardiac troponin T, pro-brain natriuretic peptide and pro-calcitonin were also assessed. Two experienced interventional cardiologists independently reviewed patient records for the presence of ischaemia as per recommendations of the Joint ESC/ACCF/AHA/WHF Task Force for the Redefinition of MI 23. The exclusion criteria included major cardiac rhythm disturbances, history of cerebral vascular disease, neurological disorder, psychiatric illness, and substance abuse. Controls consisted of individuals from KFSHRC undergoing heart valvular disease surgery and subjects with chest pain but no significant coronary stenosis based on angiography. After MI cases were separated from CAD-alone cases, 3481 MI patients and 2299 controls were available.
Details regarding the UK Biobank and CARDIoGRAMplusC4D Consortium GWAS MI patients (56,278 subjects), controls (577,716 non-MI subjects), phenotype ascertainment, and ancestry information are described elsewhere 9. The study design for these analyses and details of how the datasets were combined are also depicted in the flowchart shown in Fig. 1.
For the Saudi MI Study 1, ethical approval was obtained from the Imam Abdulrahman Bin Faisal University Institutional Review Board (IRB) committee (IRB-2019-01-104), and the study was conducted according to the ethical principles of the Declaration of Helsinki and Good Clinical Practice guidelines. Informed written consent in English, with a verified translation in Arabic, was obtained from all participants in accordance with the IRB rules. The Saudi MI Study 2 protocol was approved by the Institutional Review Board (IRB) of the King Faisal Specialist Hospital and Research Centre. Summary-level GWAS datasets for the UK Biobank and CARDIoGRAMplusC4D were downloaded through a resource database outlined in Hartiala et al. 9.

Saudi MI Study 1
Peripheral blood samples were collected in EDTA tubes and stored at 4 °C before extraction of genomic DNA using Gentra Puregene Blood kits (Qiagen, Maryland, USA) according to the manufacturer's protocol. DNA concentrations and purity were estimated by fluorometry using a NanoDrop 2000 Spectrophotometer (Thermo Fisher, MA, USA), and samples were diluted to 20 ng/µl. GWG was then performed using the Infinium Global Screening Array v3.0 (Illumina, CA, USA), which captures 654,027 SNPs or monomorphic/rare variants. Genotype data were clustered using Illumina GenomeStudio software, and standard quality control (QC) was performed using PLINK 24. Normalized intensities for all samples were generated using optiCall clustering 25. Raw genotypes were imputed using the 1000 Genomes Project (1KGP) v3 multiethnic reference panel through the Michigan Imputation Server 26. The genotype data were subjected to QC on variant missingness (< 90%) and on consistency against the Haplotype Reference Consortium (HRC) reference panel for strand, reference/alternative alleles, SNP names and genome build positions. Furthermore, the imputed data were subjected to QC to retain variants with imputation INFO scores of R² > 0.3 using Minimac, a 99% genotyping and sample call rate, and minor allele frequency (MAF) > 0.01 27. Variants with a Hardy-Weinberg equilibrium (HWE) p value < 1 × 10⁻⁸ were excluded from the analyses. Principal component analyses (PCA) were computed using the fastPCA module in the EIGENSOFT package 28. The data points were then projected on the 1KGP populations 29.
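The post-imputation filters described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study (which relied on PLINK and Minimac output); the `Variant` record, its field names and the example values are hypothetical, and treating the 99% call rate and the HWE cut-off as inclusive bounds is an assumption. Only the numeric thresholds are taken from the text.

```python
from dataclasses import dataclass

# Thresholds quoted in the text.
MIN_IMPUTATION_R2 = 0.3   # retain variants with R^2 > 0.3
MIN_CALL_RATE = 0.99      # 99% genotyping call rate (inclusive bound is an assumption)
MIN_MAF = 0.01            # retain variants with MAF > 0.01
MIN_HWE_P = 1e-8          # exclude variants with HWE p < 1e-8


@dataclass(frozen=True)
class Variant:
    """Hypothetical per-variant summary record; field names are illustrative."""
    rsid: str
    r2: float         # imputation quality
    call_rate: float  # fraction of samples with a called genotype
    maf: float        # minor allele frequency
    hwe_p: float      # Hardy-Weinberg equilibrium p value


def passes_qc(v: Variant) -> bool:
    """Return True if the variant satisfies all four filters."""
    return (
        v.r2 > MIN_IMPUTATION_R2
        and v.call_rate >= MIN_CALL_RATE
        and v.maf > MIN_MAF
        and v.hwe_p >= MIN_HWE_P
    )


def filter_variants(variants):
    """Keep only the variants that pass QC, preserving input order."""
    return [v for v in variants if passes_qc(v)]


# Invented example records, for illustration only.
example = [
    Variant("rs_example_pass", r2=0.95, call_rate=0.999, maf=0.12, hwe_p=0.40),
    Variant("rs_example_low_r2", r2=0.20, call_rate=0.999, maf=0.12, hwe_p=0.40),
    Variant("rs_example_rare", r2=0.95, call_rate=0.999, maf=0.005, hwe_p=0.40),
    Variant("rs_example_hwe", r2=0.95, call_rate=0.999, maf=0.12, hwe_p=1e-12),
]
kept = filter_variants(example)
```

In this toy input only the first record survives; each of the other three fails exactly one filter.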
Saudi MI Study 2
DNA, GWG and QC are described in detail in Wakil et al. 22. In brief, GWG was performed using the Affymetrix Axiom Genome-Wide "ASI Array" (Asian population) with ~ 537,800 directly genotyped SNPs passing QC filtering. CARDIoGRAMplusC4D and UK Biobank GWAS data and imputation are fully described in Hartiala et al. 9. These data were also imputed to the 1000 Genomes dataset using the Michigan Imputation Server 26.

Statistical analyses
Meta-analyses of GWAS: The variants passing QC for imputed dosage data were used to perform genome-wide association analyses for MI patients and controls. To account for the relatedness in the dataset, the analyses for Saudi studies 1 and 2 were performed using REGENIE 30. Supplementary Fig. 1 illustrates the Manhattan and QQ plots for the Saudi study 1 GWAS analyses. The associations were adjusted for age, sex, and the first 4 principal components. Two GWAS meta-analyses were performed to discover MI loci. The loci for these SNPs are linked to the genes RNF13 (rs41411047), PDZD2 (rs32793), ITGA1 (rs16880442), CDKN2A/B (rs2891168, rs10757274 and rs1333045), EIF4A3 (rs7211079), KCNE2 (rs998261), NDST2 (rs4691), and MRPS6 (rs28451064).
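The text does not state which meta-analysis model or software was used. As an illustration only, the sketch below combines per-study effect estimates for one variant with fixed-effect inverse-variance weighting, a common choice for GWAS meta-analysis; the function name and the input numbers are hypothetical and do not come from the study.

```python
import math

GENOME_WIDE_ALPHA = 5e-8  # genome-wide significance threshold used in the paper


def fixed_effect_meta(betas, ses):
    """Inverse-variance-weighted fixed-effect meta-analysis for one variant.

    betas: per-study effect estimates (e.g. log odds ratios)
    ses:   per-study standard errors, same order as betas
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and of equal length")
    if any(se <= 0 for se in ses):
        raise ValueError("standard errors must be positive")

    weights = [1.0 / (se * se) for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Invented inputs: two studies reporting the same effect with equal precision.
beta, se, z, p = fixed_effect_meta([0.2, 0.2], [0.1, 0.1])
is_genome_wide = p < GENOME_WIDE_ALPHA
```

With two equally precise studies the pooled effect is unchanged and the standard error shrinks by a factor of √2, which is why adding cohorts can move a signal across the significance threshold.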

Study population characteristics
We also assessed the replication of 213 SNPs with genome-wide significance from the CARDIoGRAMplusC4D + UK Biobank meta-analysis by Hartiala et al. 9. Three out of 213 SNPs from the Hartiala et al. study demonstrated replication in the meta-analyses of Saudi data 1 and data 2 9. Figure 2 also shows the three SNPs that were replicated from the 213 genome-wide significant SNPs from the Hartiala et al. 9 meta-analysis. SNPs were considered significant for inclusion if they passed the Bonferroni calculation (p ≤ 0.05/213 = 0.0002).
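The Bonferroni thresholds quoted in the paper follow from dividing the family-wise error rate by the number of SNPs tested. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold for a family-wise error rate alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


# 213 SNPs carried forward from Hartiala et al.; the paper rounds this to 0.0002.
hartiala_threshold = bonferroni_threshold(0.05, 213)

# 8 SNPs carried forward from Wakil et al.; the paper rounds this to 0.006.
wakil_threshold = bonferroni_threshold(0.05, 8)
```

Both values are rounded down in the text, so the reported thresholds are marginally stricter than the exact quotients.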

GWAS meta-analyses of Saudi datasets + CARDIoGRAMplusC4D + UK Biobank
Figure 3 shows a Manhattan plot for 2523 association signals corresponding to 66 loci (mapping to 212 genes) observed above genome-wide significance (p < 5 × 10⁻⁸). The summary statistics of the Saudi MI Study 1 and 2 plus CARDIoGRAMplusC4D + UK Biobank GWAS for p < 0.001 are shown in Supplementary Table 3. The difference in the allele frequencies for all variants in these 66 loci among European and Saudi populations is reported in Supplementary Table 4. Fifteen variants showed a > 10% difference in allele frequencies, but the majority of the variants were common (> 10% MAF) in both populations. Notably, rs11707229 in SHISA5 has an MAF of 0.02 in European populations but an MAF of 0.12 in our Saudi MI populations. The results for all 66 significant genome-wide loci are reported in Table 2. Sixty-five out of 66 loci have previously been reported as significantly associated with MI based on the GWAS catalogue (downloaded on April 27, 2023). rs2764203 was previously identified to be nominally associated with MI (p = 1.0 × 10⁻⁷) but was found to be significantly associated with MI after the addition of the Saudi data in the meta-analyses (p = 2 × 10⁻⁸).

Discussion
We performed GWG, imputation and GWAS on two independent Saudi Arabian studies comprising a total of 3950 MI patients and 2324 non-MI controls. Meta-analyses were first performed on the two Saudi MI studies alone, resulting in 6 loci with genome-wide significance, and then in combination with the CARDIoGRAMplusC4D and UK Biobank GWAS studies, resulting in 66 loci with genome-wide significance. Our results replicated many MI associations, whereas in Saudi-only GWAS (meta-analyses), several new loci were implicated that require future validation and functional analyses.
The new genome-wide signal for MI from the meta-analyses of the four MI studies, rs2764203, is located approximately 4 kb from RP3-375P9.2 and ~ 20 kb from small nuclear ribonucleoprotein polypeptide C (SNRPC). Very little information is available from any previous studies of the long noncoding RNA RP3-375P9.2, apart from an association in a hepatocellular carcinoma (HCC) genomic and epigenomics study within early- and late-stage patients 32. The RP3-375P9.2 lncRNA does not appear to be associated with MI in a recent pathway-based study 33.
Small nuclear ribonucleoprotein polypeptide C (SNRPC) encodes one of the specific protein components of the U1 small nuclear ribonucleoprotein (snRNP) particle, which is needed for the formation of the spliceosome 34,35. It is critical to the initiation and regulation of pre-mRNA splicing and is broadly expressed in most tissues, including heart tissues 36. A recent study by Zhang et al. showed that SNRPC has the potential to promote the motility of hepatocellular carcinoma (HCC) cells via induction of epithelial-mesenchymal transition and to serve as a prognostic biomarker in HCC and predictor of immunotherapy responses 37,38. SNRPC has also been shown to impact sex biases in systemic autoimmune diseases 39.
The Shisa family member 5 (SHISA5) intronic association (rs11707229) in this MI study is interesting, as the observed minor allele frequency was > 12% in our overall Saudi population but has been reported to be approximately 2% in European populations, less than 1% in African populations and very rare in most Asian populations (http://www.ncbi.nlm.nih.gov/snp/rs11707229). SHISA5 is a member of the Shisa family of single-transmembrane proteins, which are characterized by N-terminal cysteine-rich domains and proline-rich C-terminal regions. SHISA5 is located in the endoplasmic reticulum and the nuclear membrane and appears to have roles in numerous biological processes: it is involved in the regulation of autophagy and in p53-inducible proapoptosis in a caspase-dependent manner, is inducible by interferon and has an effect on the Wnt signalling pathway 40–43. Associations of SHISA5 to date are largely limited to anthropometric traits, red cell characteristics and the glomerular filtration rate (GFR) 44–46. Lakota and colleagues have previously described the upregulation of SHISA5 in mesenchymal stem cells (MSCs) transplanted into human subjects with ischaemic cardiomyopathy and controls and postulated that SHISA5 contributes to the death of cardiomyocytes via apoptosis after ischaemia-reperfusion injury 47,48. Alternatively spliced isoforms of Shisa5 that differ at the C-terminus have been previously reported, and numerous variants impacting alternative splicing acceptor or donor sites appear likely to affect the specificity of its interactions 41.
In conclusion, our study not only successfully replicated many known MI associations but also, through our Saudi-specific GWAS meta-analyses, identified several novel loci. These newly implicated loci, including the RP3-375P9.2 lncRNA and the SNRPC gene, present exciting opportunities for future validation and functional analyses. Moreover, the association with SNPs in SHISA5, considering the distinct minor allele frequency differences between Saudi and European populations, offers potential insights into the high MI prevalence in Saudi Arabia. Such findings emphasize the critical need for genetic studies across diverse ancestral cohorts to ensure a holistic understanding of MI. This study has numerous limitations, including a limited number of MI controls, discordance in hypertension prevalence between the two Saudi MI studies and incomplete BMI measurements for a small number of the study subjects. Consanguineous populations such as the Saudi Arabian population offer an invaluable opportunity to explore rare and structural variants that are linked to disease. Future studies will involve more elegant methodologies to enhance the power of GWAS in consanguineous populations, inclusion of modifiable and nonmodifiable risk factors in predicting the risk of common diseases and strategic tools to analyse multiple genetic variants and exposure variables to uncover the hidden heritability of MI and concomitant comorbidities.

Figure 2. Meta-analysis overview of Saudi MI Study 1 and 2 plus CARDIoGRAMplusC4D + UK Biobank GWAS: Synthesis view plot showing p values from the four analyses in the first panel and their odds ratios and confidence intervals for: Saudi MI Study 2 (Panel 2, blue); CARDIoGRAMplusC4D + UK Biobank (Panel 3, red); Saudi MI Study 1 + 2 (Panel 4, green) and Saudi MI Study 1 + 2 and CARDIoGRAMplusC4D + UK Biobank (Panel 5, yellow). The 10 replicated SNPs are shown on the y-axis.

Figure 3. (A) Manhattan plot for MI genome-wide significant signals for the full meta-analysis comprising 3950 Saudi MI patients and 2324 controls and 56,278 MI patients and 577,716 controls from CARDIoGRAMplusC4D + UK Biobank. (B) Quantile-Quantile (Q-Q) plot for the meta-analyses (genomic inflation factor λ = 1.203). The horizontal red line indicates genome-wide significance (p value ≤ 5 × 10⁻⁸). SNPs coloured green have not been identified in previous studies.

Table 1 summarizes the demographic characteristics of the two Saudi cohorts included in this study. In both cohorts, there were more subjects with MI represented compared to controls having no MI. Saudi MI Study 1 included 469 patients (95%) and 25 controls (5%), whereas Saudi MI Study 2 included 3481 patients (60%) and 2299 controls (40%). Overall, there were more men than women represented in the study; the male to female ratio in both cohorts was ~ 70% to 30%. Both sexes were equally represented in the control group of Study 2. Study 1 had a balanced median age of 55 (47, 63) years for the patients and 54 (44, 64) years for the controls, while Study 2 was represented by a larger distribution of ages with a median age of 60 (51, 69) years for patients and 48 (35, 59) years for controls. BMI measurements were not available in 4-10% of study subjects, but of those measured, the median BMI was slightly higher in Study 1 {29.3 (25.8, 32.7) for the patients and 30.1 (27.4, 35.3) for the controls} than in Study 2 {28.9 (25.6, 32.5) for the patients and 28.6 (24.5, 33.4) for the controls}. In Study 2, the patients with MI had a much higher prevalence of hypertension (81%) than those in Study 1 (33%).
Meta-analyses of 3950 MI patients and 2324 controls from Saudi MI Study 1 and 2 resulted in 17 SNPs (6 loci) reaching genome-wide significance. The Manhattan plot for the Saudi data meta-analyses is shown in Supplementary Fig. 2. Supplementary Table 1 shows the quality control and quality assurance metrics for the SNP filtering for the two Saudi MI studies. The meta-analysis summary statistics of Study 1 and 2 signals for p < 0.001 are shown in Supplementary Table 2. We tested for replication of eight MI-associated SNPs from the Wakil et al. original GWAS paper from which Study 2 cases and controls were derived, of which 3 SNPs were of genome-wide significance and 5 additional SNPs had a suggestive p value of < 1 × 10⁻⁵ 22. Seven out of eight SNPs from Wakil et al. were replicated in this study at the Bonferroni threshold (p value ≤ 0.05/8 = 0.006).

Figure 1. This flowchart provides a visual representation of the study design, detailing the progression from participant recruitment to statistical analyses.

Table 1. Demographics of the two Saudi cohorts included in the MI meta-analysis.
BMI 29.3 (25.8, 32.8) 20.1 (27.4, 35.3) 28.9 (25.4, 35.3) 28.6 (24.5, 33.4)

Scientific Reports | (2023) 13:21866 | https://doi.org/10.1038/s41598-023-49105-1

Table 2. The resulting 66 genomic risk loci from GWAS meta-analyses across 60,228 MI patients and 580,040 non-MI controls from Saudi MI Study 1 & 2, CARDIoGRAMplusC4D and the UK Biobank. *nGWAS SNPs refer to the number of GWAS significant SNPs in the loci, and nLead SNPs refer to the number of independent lead SNPs in the loci. The chromosome start-end positions for risk loci are shown in the locus column. Previously reported SNPs were identified using the LD trait tool, and the results from the EBI GWAS catalogue downloaded on 04/05/2023 were used.
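The pooled sample sizes in the Table 2 caption can be checked against the per-cohort counts given in the text. The short sketch below does that arithmetic; the dictionary layout and helper names are illustrative, while the counts are those reported in the paper.

```python
# Case and control counts as reported in the paper.
SAUDI_STUDY_1 = {"cases": 469, "controls": 25}
SAUDI_STUDY_2 = {"cases": 3481, "controls": 2299}
CONSORTIA = {"cases": 56_278, "controls": 577_716}  # CARDIoGRAMplusC4D + UK Biobank


def pooled_counts(*cohorts):
    """Sum cases and controls across any number of cohorts."""
    return {
        "cases": sum(c["cases"] for c in cohorts),
        "controls": sum(c["controls"] for c in cohorts),
    }


def case_fraction(cohort) -> float:
    """Proportion of a cohort made up of MI cases."""
    return cohort["cases"] / (cohort["cases"] + cohort["controls"])


saudi_total = pooled_counts(SAUDI_STUDY_1, SAUDI_STUDY_2)
overall_total = pooled_counts(SAUDI_STUDY_1, SAUDI_STUDY_2, CONSORTIA)
```

The Saudi totals come to 3950 patients and 2324 controls, and the overall totals to 60,228 patients and 580,040 controls, which agrees with the figures quoted in the abstract and in this caption.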